[57] ABSTRACT

The present invention is directed to a system and method for
the distributed management of the storage space and data on
a networked computer system wherein the networked computer
system includes at least two storage devices for storing
data files comprised of one or more binary objects. The
distributed storage management system includes a device for
selectively copying the binary objects stored on one of the
storage devices to another of the storage devices and another
device for calculating a current value for a binary object
identifier for selected binary objects stored on the storage
devices wherein the calculation of the binary object
identifier is based upon the actual data contents of the
associated binary object. The distributed storage management
system further includes a device for storing the current value
of the binary object identifier as a previous value of the
binary object identifier, another device for comparing the
current value of the binary object identifier associated with
a particular binary object to one or more previous values of
the binary object identifier associated with that particular
binary object and a device for commanding the device for
selectively copying binary objects in response to the device
for comparing.

18 Claims, 14 Drawing Sheets
SYSTEM AND METHOD FOR DISTRIBUTED
STORAGE MANAGEMENT ON
NETWORKED COMPUTER SYSTEMS
USING BINARY OBJECT IDENTIFIERS

This application is a continuation of application Ser. No.
08/085^96 filed on Jul. 1, 1993, now abandoned.

BACKGROUND OF THE INVENTION

1. Field of the Invention
The present invention is directed generally to a system
and method for distributed storage management on a
networked computer system and, more specifically, to a system
and method for distributed storage management on a
networked computer system including a remote backup file
server and one or more local area networks in communication
with the remote backup file server.

2. Description of the Background of the Invention
Backup copies of information stored on a computer
system must be made so that if a failure occurs which causes
the original copies of the data to be lost, the lost data can be
recovered as it existed at the time when the last backup copy
was made. Backup/restore systems have a long history on all
types of computer systems from mainframes to
minicomputers, local area network file servers and desktop
workstations.

Historically, backup systems have operated by making
copies of a computer system's files on a special backup
input/output device such as a magnetic tape drive, floppy
diskette drive, or optical disk drive. Most systems allow full
backup, partial backup (e.g., specified drives, directories, or
files), or incremental backups based on files changed after a
certain date or time. Copies of files made during a backup
procedure are stored on these special backup devices and are
then later retrieved during a restore operation either under
file names derived from the original file, from the date/time
of the backup operation or from a serially-incremented
number. The backup procedure is typically accomplished on
an individual computer/file server basis, rather than through
a single coordinated approach encompassing multiple
systems. That is, the computer resources of two computers at
most (the one processing the files to be backed up and the
one with the backup device attached) are employed to effect
the backup process, regardless of the actual number of
computers networked together.

Today, the absolute numbers of computers networked
together by organizations are increasing rapidly as is the
number of different types of computers and operating
systems in use. At the same time, the number of storage devices
and the capacities incorporated into each of these units is
[illegible] restore approaches which have been traditionally
used have become less reliable, more expensive, and more
consumptive of human time and attention.

Thus the need exists for a system designed to overcome
the limitations of the existing backup/restore systems that
have the following characteristics: (1) is capable of operating
on a networked computer system incorporating various
types of computers and operating systems; (2) is capable of
accommodating a large array of large capacity storage
devices; (3) is reliable; (4) is capable of operating with a
minimum amount of human intervention; and (5) is
relatively inexpensive.

SUMMARY OF THE INVENTION

The present invention is directed to a system for the
distributed management of the storage space and data on a
networked computer system wherein the networked computer
system includes at least two storage devices for storing
data files comprised of one or more binary objects. The
distributed storage management system includes means for
selectively copying the binary objects stored on one of the
storage devices to another of the storage devices and means
for calculating a current value for a binary object identifier
for selected binary objects stored on the storage devices
wherein the calculation of the binary object identifier is
based upon the actual data contents of the associated binary
object. The distributed storage management system further
includes means for storing the current value of the binary
object identifier as a previous value of the binary object
identifier, means for comparing the current value of the
binary object identifier associated with a particular binary
object to one or more previous values of the binary object
identifier associated with that particular binary object and
means for commanding the means for selectively copying
binary objects in response to the means for comparing.

The present invention is further directed to a method for
the management of the storage space and data on a computer
system wherein the computer system includes at least two
storage areas for storing data files comprised of one or more
binary objects. The storage space management method
includes the following steps: (1) selectively copying the
binary objects stored in one of the storage areas to another
of the storage areas; (2) calculating a current value for a
binary object identifier for selected binary objects stored in
the storage areas wherein the calculation of the binary object
identifier is based upon the actual data contents of the
associated binary object; (3) storing the current value of the
binary object identifier as a previous value of the binary
object identifier; (4) comparing the current value of the
binary object identifier associated with a particular binary
object to one or more previous values of the binary object
identifier associated with that particular binary object; and
(5) controlling the step for selectively copying binary
objects in response to the step for comparing.

The system and method of the present invention for the
management of the storage space on a computer system
provide a backup/restore system that is capable of operating
on a networked computer system incorporating various
types of computers and operating systems, is capable of
accommodating a large array of large capacity storage
devices, is reliable, is capable of operating with a minimum
amount of human intervention and is relatively inexpensive.
These and other advantages and benefits of the present
invention will become apparent from the description of a
preferred embodiment hereinbelow.

BRIEF DESCRIPTION OF THE DRAWINGS

A preferred embodiment will now be described, by way of
example only, with reference to the accompanying figures
wherein:

FIG. 1 illustrates a simplified representation of a
networked computer system in which the system and method of
the present invention may be employed;

FIG. 2 illustrates the manner in which the Distributed
Storage Manager program of the present invention allocates
the storage space on each of the storage devices illustrated
in FIG. 1;

FIG. 3 illustrates the File Database utilized by the
Distributed Storage Manager program of the present invention;

FIG. 4 illustrates the Backup Queue Database utilized by
the Distributed Storage Manager program of the present
invention; and
FIGS. 5a-5l illustrate flow charts explaining the operation
of the Distributed Storage Manager program of the present
invention.

DETAILED DESCRIPTION OF THE 
PREFERRED EMBODIMENT 

FIG. 1 illustrates a simplified representation of a typical
networked computer system 10 in which the system and
method of the present invention for distributed storage
management on networked computer systems may be
employed. A remote backup file server 12 is in
communication, via data path 13, with a wide area network
14. The wide area network 14 is, in turn, in communication
with a plurality of local area networks 16 via data paths 15.
Those of ordinary skill in the art will recognize that any
number of wide area networks 14 may be in communication
with remote backup file server 12 and that any number of
local area networks 16 (from 1 to more than 100) may be in
communication with each wide area network 14. Those of
ordinary skill in the art will also recognize that the means for
communication between remote backup file server 12, wide
area network 14 and local area networks 16 over data paths
13 and 15 is well known.

Each local area network 16 includes multiple user
workstations 18 and local computers 20 each in communication
with their respective local area network 16 via data paths 17.
Again, those of ordinary skill in the art will recognize that
the means for communication between user workstations 18,
local computers 20 and local area networks 16 via data paths
17 is well known. The storage space on each disk drive 19
on each local computer 20 in the networked computer
system 10 is allocated as follows and as is shown in FIG. 2:
(1) operating system files 22; (2) a Distributed Storage
Manager program 24 which embodies the system and
method of the present invention (the operation of which is
described in detail hereinbelow); (3) a File Database 25 (the
structure of which is described in detail hereinbelow); (4) a
Backup Queue Database 26 (the structure of which is
described in detail hereinbelow); (5) local computer data
files 28; (6) free disk space 30 and (7) compressed storage
files 32 (created by the Distributed Storage Manager
program 24 of the present invention as is explained more fully
hereinbelow).

The Distributed Storage Manager program 24 of the
present invention builds and maintains the File Database 25
on one of the disk drives 19 on each local computer 20 in the
networked computer system 10 according to the structure
illustrated in FIG. 3. The File Database 25 stores information
relating to each file that has been backed up by the
Distributed Storage Manager program 24 since the initialization of
that program on each local computer 20. The File Database
25 is comprised of three levels of records organized
according to a predefined hierarchy. The top level record, File
Identification Record 34, includes identification information
for each file that has been backed up by Distributed Storage
Manager program 24. File Identification Record 34 contains
the following elements: (1) Record Type 36 (identifies the
file as either a directory file or a regular file); (2) File
Location 38 (name of the directory in which the file resides);
(3) File Name 40 (name of the file); (4) Migration Status 41
(explained more fully hereinbelow); and (5) Management
Class 43 (explained more fully hereinbelow).

For each File Identification Record 34 in File Database
25, one or more Backup Instance Records 42 are created that
contain information about the file (identified by File
Identification Record 34) at the time that the file is backed up.
Each time that a file is backed up, a Backup Instance Record
42 is created for that file. Each Backup Instance Record 42
consists of the following elements: (1) Link to File
Identification Record 44; (2) Backup Cycle Identifier 46 (identifies
the particular backup cycle during which the Backup
Instance Record 42 is created); (3) File Size 48; (4) Last
Modified Date/Time 50; (5) Last Access Date/Time 52; (6)
File Attributes 54 (e.g., read-only, system, hidden); (7)
Delete Date 56 (date on which the file was deleted); and (8)
Insert Date 57 (date on which the Backup Instance Record
42 was created).
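
The two upper levels of the File Database 25 hierarchy can be pictured with a small data-structure sketch. This is only an illustration under stated assumptions: the class and attribute names, the Python types, the use of None for a "zero" Delete Date, and the modeling of the Link to File Identification Record 44 as simple containment are all choices made here, not details taken from the description.

```python
from dataclasses import dataclass, field
from datetime import date, datetime
from typing import FrozenSet, List, Optional


@dataclass
class BackupInstanceRecord:
    """One record per backup of a file (Backup Instance Record 42)."""
    backup_cycle_id: int                 # Backup Cycle Identifier 46 (numeric form assumed)
    file_size: int                       # File Size 48
    last_modified: datetime              # Last Modified Date/Time 50
    last_access: datetime                # Last Access Date/Time 52
    attributes: FrozenSet[str]           # File Attributes 54, e.g. {"read-only"}
    insert_date: date                    # Insert Date 57
    delete_date: Optional[date] = None   # Delete Date 56; None stands in for "zero"


@dataclass
class FileIdentificationRecord:
    """Top-level record (File Identification Record 34)."""
    record_type: str                     # Record Type 36: "directory" or "regular"
    file_location: str                   # File Location 38
    file_name: str                       # File Name 40
    migration_status: str                # Migration Status 41
    management_class: str                # Management Class 43
    # Containment replaces the explicit Link to File Identification Record 44.
    backups: List[BackupInstanceRecord] = field(default_factory=list)

    def latest_backup(self) -> Optional[BackupInstanceRecord]:
        """Return the instance with the highest backup cycle identifier, if any."""
        return max(self.backups, key=lambda b: b.backup_cycle_id, default=None)
```

Keeping the instance records inside their parent makes "the most recent backup of the file" a one-line lookup; a real implementation would more likely store the link as a key into a database table.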

Associated with each Backup Instance Record 42 is one
or more Binary Object Identification Records 58. The
Distributed Storage Manager program 24 views a file as a
collection of data streams. A data stream is defined as a
distinct collection of data within the file that may be changed
independently from other distinct collections of data within
the file. For example, a file may contain its normal data and
may also contain extended attribute data. A user may change
the extended attribute data without modifying any of the
normal data or vice versa. The Distributed Storage Manager
program 24 further divides each data stream into one or
more binary objects. If the size of the data stream is equal
to or less than a previously defined convenient maximum
binary object size (currently one (1) megabyte), then a single
binary object represents the data stream. If the data stream
is larger than the maximum binary object size, then the
Distributed Storage Manager program 24 divides the data
stream into multiple binary objects, all but the last of which
are equal in size to the maximum binary object size. A
Binary Object Identification Record 58 is created for each
binary object that comprises the file which was backed up
during the backup cycle identified by the Backup Cycle
Identifier 46 of a particular Backup Instance Record 42.
Each Binary Object Identification Record 58 includes the
following components: (1) Link to Backup Instance Record
60; (2) Binary Object Stream Type 62 (e.g., data, extended
attributes, security); (3) Binary Object Size 64; (4) Binary
Object CRC32 66 (explained more fully hereinbelow); (5)
Binary Object LRC 68 (explained more fully hereinbelow);
(6) Binary Object Hash 70 (explained more fully
hereinbelow); and (7) Binary Object Offset 72 (explained
more fully hereinbelow). The Binary Object Size 64, Binary
Object CRC32 66, Binary Object LRC 68 and Binary Object
Hash 70 comprise the Binary Object Identifier 74 which is
a unique identifier for each binary object to be backed up and
is discussed in more detail below.
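
The segmentation rule and the four-field Binary Object Identifier 74 can be sketched as follows. This is a minimal illustration, not the patented implementation. The LRC and hash loops follow the pseudocode given later in this description, but the initial values of zero, the little-endian byte order, the zero padding of a trailing partial word, the left rotation within 32 bits, and the use of Python's zlib.crc32 as the "standard" CRC are all assumptions. The function names are hypothetical.

```python
import zlib
from typing import Iterator, NamedTuple, Tuple

MAX_OBJECT_SIZE = 1024 * 1024  # "one (1) megabyte" maximum binary object size


class BinaryObjectIdentifier(NamedTuple):
    size: int    # Binary Object Size 64
    crc32: int   # Binary Object CRC32 66
    lrc: int     # Binary Object LRC 68
    hash: int    # Binary Object Hash 70


def split_stream(data: bytes, max_size: int = MAX_OBJECT_SIZE) -> Iterator[Tuple[int, bytes]]:
    """Yield (offset, binary object); all but the last object are max_size long."""
    # max(..., 1) makes an empty stream yield one empty object (an assumption).
    for offset in range(0, max(len(data), 1), max_size):
        yield offset, data[offset:offset + max_size]


def lrc32(data: bytes, init: int = 0) -> int:
    """XOR together the 32-bit double words of the data (zero padded)."""
    padded = data + b"\x00" * (-len(data) % 4)
    lrc = init
    for i in range(0, len(padded), 4):
        lrc ^= int.from_bytes(padded[i:i + 4], "little")
    return lrc


def rotating_hash(data: bytes, init: int = 0) -> int:
    """Per 16-bit word: rotate by 5 bits, add one, add the word (mod 2**32)."""
    padded = data + b"\x00" * (-len(data) % 2)
    h = init
    for i in range(0, len(padded), 2):
        word = int.from_bytes(padded[i:i + 2], "little")
        h = ((h << 5) | (h >> 27)) & 0xFFFFFFFF  # rotate left within 32 bits
        h = (h + 1 + word) & 0xFFFFFFFF
    return h


def identify(obj: bytes) -> BinaryObjectIdentifier:
    """Build a 128-bit identifier (four 32-bit fields) from the object's contents."""
    return BinaryObjectIdentifier(
        size=len(obj) & 0xFFFFFFFF,
        crc32=zlib.crc32(obj) & 0xFFFFFFFF,
        lrc=lrc32(obj),
        hash=rotating_hash(obj),
    )
```

Because every field is derived only from the object's bytes, two objects with identical contents receive identical identifiers regardless of file name or location, which is the property the description relies on.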

The Distributed Storage Manager program 24 also builds
and maintains the Backup Queue Database 26 on one of the
disk drives 19 on each local computer 20 in the networked
computer system 10 according to the structure illustrated in
FIG. 4. Each entry (Backup Queue Record 75) in the Backup
Queue Database 26 is comprised of the following
components: (1) Record Type 76 (identifies the file as either a
directory file or a regular file); (2) File Location 78 (name of
the directory in which the file resides); (3) File Name 80
(name of the file); (4) File Status 82 ("new", "modified" or
"deleted"); (5) File Size 84; (6) Last Modified Date/Time 86;
(7) Last Access Date/Time 88; (8) File Attributes 90 (e.g.,
read-only, system, hidden); and (9) File Priority 92
(explained more fully hereinbelow).
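
How the File Status 82 field ends up as "new", "modified" or "deleted" is set out in the flow-chart discussion of steps 100 through 114; the sketch below condenses that logic. It is a hedged illustration only: the FileBlock and BackupQueueRecord classes, the dictionary keyed by (location, name), and the reconcile function are hypothetical names introduced here, and the assignment of the user-defined priority (step 108) is left out.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, Tuple


@dataclass
class FileBlock:
    """File block information returned by the disk scan (fields assumed)."""
    location: str
    name: str
    size: int
    modified: float
    accessed: float
    attributes: FrozenSet[str]


@dataclass
class BackupQueueRecord:
    block: FileBlock
    status: str = "DELETED"   # every record built from the File Database starts as DELETED
    priority: int = 0         # File Priority 92; assignment not modeled here


Queue = Dict[Tuple[str, str], BackupQueueRecord]


def reconcile(queue: Queue, scanned: Iterable[FileBlock]) -> None:
    """Update the queue in place from the files found on disk."""
    for block in scanned:
        key = (block.location, block.name)
        record = queue.get(key)
        if record is None:
            # Never backed up before: create a record marked NEW.
            queue[key] = BackupQueueRecord(block, status="NEW")
        elif record.block != block:
            # Some file block value changed: refresh the record, mark MODIFIED.
            record.block = block
            record.status = "MODIFIED"
        else:
            # Unchanged since the last backup: nothing to back up.
            del queue[key]
    # Records still marked DELETED were not found on disk by the scan.
```

After the scan, the queue holds exactly the files that need attention in the current backup cycle: new files, modified files, and files that have disappeared since the previous cycle.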

The operation of the Distributed Storage Manager
program 24 may be illustrated by way of the flow charts
depicted in FIGS. 5a through 5l. For explanation purposes,
the Distributed Storage Manager program 24 is divided into
several distinct functions which will be discussed in turn.
Those of ordinary skill in the art will recognize, however,
that each of the distinct functions operates in cooperation
with the other functions to form a unitary computer program.
Those of ordinary skill in the art will also recognize that the
following discussion illustrates the operation of the
Distributed Storage Manager program 24 on a single local
computer 20, although it should be understood that the
Distributed Storage Manager program 24 operates in the same
fashion on each local computer 20 on the networked
computer system 10. The Distributed Storage Manager program
24 can either be executed on user demand or can be set to
execute periodically on a user-defined schedule.

1. Identification of Binary Objects to be Backed Up
In the flow chart of FIG. 5a, execution of the Distributed
Storage Manager program 24 begins at step 100 where the
Backup Queue Database 26 is built by creating a Backup
Queue Record 75 for each File Identification Record 34
found in File Database 25. In this way, a list of files that were
backed up during the previous backup cycle is established so
that it can be determined which files need to be backed up
during the current backup cycle. To create each Backup
Queue Record 75, the Backup Instance Record 42
representing the most recent backup of the file represented by
each File Identification Record 34 is located. This
determination is made by examining the Backup Cycle Identifier 46
in each Backup Instance Record 42. The Backup Cycle
Identifier 46 may represent either a date (month/day/year) or
numerical value assigned to a particular backup cycle. The
Backup Queue Record 75 is comprised of certain of the data
fields of both the File Identification Record 34 and the
Backup Instance Record 42. During the process of creating
each Backup Queue Record 75, the File Status field 82 is set
to "DELETED". However, if the Delete Date field 56 of the
most recent Backup Instance Record 42 associated with the
File Identification Record 34 currently being processed is
non-zero, indicating that the file has been previously deleted,
then no Backup Queue Record 75 is created for that File
Identification Record 34. If the backup that is currently
being processed for the local computer 20 is not a full
backup (i.e., all files on all disk drives 19 on the local
computer 20), then the Distributed Storage Manager
program 24 will only create Backup Queue Records 75 for those
files that match the backup specifications. For example, if
only those files that have a file extension of ".EXE" are to
be backed up, then only File Identification Records 34 that
correspond to ".EXE" files will be processed.

Program control then continues with step 102 where the
Distributed Storage Manager program 24 of the present
invention scans all disk drives 19 on the local computer 20
that are to be backed up. This operation consists of scanning
the directory hierarchy on each disk drive 19 on the local
computer 20 and returning to the Distributed Storage
Manager program 24 certain file block information for each of
the directory files and regular files that are stored on the disk
drives 19 to be backed up. A typical computer operating
system maintains a file block for each file stored on the
system which includes information such as file location, file
type, user-assigned file name, file size, creation date and
time, modify date and time, access date and time and file
attributes. This operation may be controlled by some
parameters that indicate which drives, directories and files are to
be backed up during a backup operation. However, the
default operation is to back up all files on all disk drives 19
on the local computer 20. Program control then continues
with step 103 where the Distributed Storage Manager
program 24 determines whether the file block information for
an additional file has been located on the disk drives 19. If
an additional file has been located, program control
continues with step 104. If an additional file has not been located,
program control continues with step 116.

In step 104, the Distributed Storage Manager program 24
determines whether a Backup Queue Record 75 exists for
the located file by comparing the file's file block information
to the information stored in Backup Queue Database 26. If
such a Backup Queue Record 75 does not exist (i.e., this is
the first time this file will be backed up), program control
continues with step 106 where a Backup Queue Record 75
for the file is created using the information contained within
the file's file block. The File Status field 82 for the newly
created Backup Queue Record 75 is set to "NEW". Program
control then continues with step 108 where a user-defined
priority is assigned to the file and stored in the File Priority
field 92 of the Backup Queue Record 75. This user-defined
priority may be assigned to the file by methods that are
well-known to those of ordinary skill in the art. The use of
the File Priority field by the Distributed Storage Manager
program 24 is discussed in more detail hereinbelow.
Program control is then returned to step 102.

If the Distributed Storage Manager program 24
determines, in step 104, that a Backup Queue Record 75
exists in the Backup Queue Database 26 for the located file,
program control continues with step 110 where it is
determined whether any change has been made to the file. This
determination is made by comparing the information in the
file's file block with the information stored in the file's
Backup Queue Record 75. If any of the values have changed,
program control continues with step 112 where File Status
field 82 is set to "MODIFIED" and the fields in the Backup
Queue Record 75 are updated from the file's file block
information. Program control then continues with step 108
where a user-defined priority is assigned to the file and
stored in File Priority field 92; program control is then
returned to step 102. If the determination is made in step 110
that no change has been made to the file, then, in step 114,
the Backup Queue Record 75 is deleted from the Backup
Queue Database 26 since the file does not need to be backed
up. Following step 114, program control is returned to step
102.

If the Distributed Storage Manager program 24
determines, in step 103, that an additional file has not been
located, program control continues with step 116. In step
116, the Distributed Storage Manager program 24 reads each
Backup Queue Record 75 in Backup Queue Database 26,
one at a time. The Backup Queue Records 75 in Backup
Queue Database 26 represent all of the files that must be
backed up by the Distributed Storage Manager program 24
during the present backup cycle. Program control continues
with step 117 where the Distributed Storage Manager
program 24 determines whether a next Backup Queue Record
75 has been located in Backup Queue Database 26. If a next
Backup Queue Record 75 has been located, program control
continues with step 118; otherwise, program control
continues with step 119, where the routine illustrated by the flow
chart of FIG. 5a is terminated. In step 118, the Distributed
Storage Manager program 24 determines whether the File
Status field 82 in the Backup Queue Record 75 currently
being processed is set to "DELETED". If the File Status
field 82 is set to "DELETED", program control continues
with step 120 where the Delete Date field 56 in the most
recent Backup Instance Record 42 associated with the file
identified by the Backup Queue Record 75 currently being
processed is set to the current date. A list of all Binary Object
Identification Records 58 associated with the Backup
Instance Record 42 for the file identified by the Backup
Queue Record 75 currently being processed is placed in a
delete queue (not shown) that will be used by Distributed
Storage Manager program 24 to delete all Binary Object
Identification Records 58 for binary objects that have been
deleted from the disk drives 19 of local computer 20.
Program control then continues with step 122 where the
Backup Queue Record 75 currently being processed is
deleted from the Backup Queue Database 26. Program
control is then returned to step 116.

If the Distributed Storage Manager program 24
determines, in step 118, that the File Status field 82 of the
Backup Queue Record 75 currently being processed is not
set to "DELETED", program control continues with step
124 where the Distributed Storage Manager program 24
determines whether the File Status field 82 of the Backup
Queue Record 75 currently being processed is set to
"NEW". If the File Status field 82 is set to "NEW", program
control continues with step 126 where a File Identification
Record 34 is created in File Database 25 using the
information stored in the Backup Queue Record 75 currently
being processed. Program control then continues with step
130. If the Distributed Storage Manager program 24
determines, in step 124, that the File Status field 82 of the
Backup Queue Record 75 currently being processed is not
set to "NEW" (i.e., the file has been modified since the last
backup cycle), program control continues with step 128
where the File Identification Record 34 associated with the
file identified by the Backup Queue Record 75 currently
being processed is located in the File Database 25. Program
control then continues with step 130. In step 130, the
Distributed Storage Manager program 24 creates a new
Backup Instance Record 42 in the File Database 25 for the
file identified by the Backup Queue Record 75 currently
being processed. The Backup Instance Record 42 is created
using information stored in the associated File Identification
Record 34 and the Backup Queue Record 75 currently being
processed. The Backup Cycle Identifier 46 is set to indicate
that the file is to be backed up during the current backup
cycle. The Delete Date field 56 is initialized to "zero". The
Insert Date field 57 is set to the current date.

Program control then continues with step 132 where the
Distributed Storage Manager program 24 separates the file
identified by the Backup Queue Record 75 currently being
processed into its component data streams. Each data stream
is then processed individually. Those of ordinary skill in the
art will recognize that these data streams may represent
regular data, extended attribute data, access control list data,
etc. Program control continues with step 134 where the
Distributed Storage Manager program 24 determines
whether each of the data streams currently being processed
is larger than the maximum binary object size (currently one
(1) megabyte). If the data stream is larger than one (1)
megabyte, program control continues with step 136 where
the data stream currently being processed is segmented into
multiple binary objects smaller in size than one (1)
megabyte. Either following step 136 or, if the determination is
made in step 134 that the data stream currently being
processed is not larger than one (1) megabyte (and, thus, the
data stream is represented by a single binary object),
program control continues with step 138.

In step 138, a Binary Object Identification Record 58 is
created in File Database 25 for each of the binary objects
currently being processed. Each of these Binary Object
Identification Records 58 are associated with the Backup
Instance Record 42 created in step 130. The Binary Object
Identifier 74 portion of each Binary Object Identification
Record 58 is comprised of the Binary Object Size field 64,
the Binary Object CRC32 field 66, the Binary Object LRC
field 68 and the Binary Object Hash field 70. Each of the
fields of the Binary Object Identifier 74 may be four (4)
bytes in length and is calculated from the contents of each
binary object. The Binary Object Size field 64 may be set
equal to the byte-size of the binary object. The Binary Object
CRC32 field 66 may be set equal to the standard 32-bit
Cyclical Redundancy Check number calculated against the
contents of the binary object [illegible] at a
time. Those of ordinary skill in the art will readily recognize
the manner in which the Cyclical Redundancy Check number
is calculated. The Binary Object LRC field 68 may be set
to the standard Longitudinal Redundancy Check number
calculated against the contents of the binary object taken
one (1) double word (32 bits) at a time using the following
algorithm:

    binary object LRC = (initialized value)
    for each double word (32 bits) of the binary object data:
        LRC = LRC (XOR) double word of binary object data

The Binary Object Hash field 70 is calculated against the
contents of the binary object taken one (1) word (16 bits) at
a time using the following algorithm:

    hash = (initialized value)
    for each word (16 bits) of the binary object:
        rotate current hash value by 5 bits
        hash = hash + 1
        hash = hash + (current word (16 bits) of binary object)

Since the Binary Object Identifier 74 is used to uniquely
identify a particular binary object, it is important that the
possibility of two different binary objects being assigned the
same Binary Object Identifier 74 be very small. This is the
reason for implementing the Binary Object Identifier 74
using 128 bits and four separate calculations. Although a
Binary Object Identifier 74 may be calculated in various
ways, the key notion is that the Binary Object Identifier is
calculated from the contents of the data instead of from an
external and arbitrary source. By incorporating the Binary
Object Size field 64 within the Binary Object Identifier 74,
only binary objects that are exactly the same size can
generate duplicate Binary Object Identifiers 74. Further, the
calculations used to determine the Binary Object CRC32
field 66, the Binary Object LRC field 68 and the Binary
Object Hash field 70 are relatively independent of each
other. Using the calculations set forth above, the probability
that the Distributed Storage Manager program 24 will
generate the same Binary Object Identifier 74 for two different
binary objects is extremely low. Those of ordinary skill in
the art will recognize that there exist many different ways of
establishing the Binary Object Identifier 74 (e.g.,
establishing a Binary Object Identifier 74 of a different length or
utilizing different calculations) and that the procedure set
forth above is only one way of establishing the Binary
Object Identifier 74. The critical feature to be recognized in
creating a Binary Object Identifier 74 is that the identifier
should be based on the contents of the binary object so that
the Binary Object Identifier 74 changes when the contents of
the binary object changes. In this way, duplicate binary
objects, even if resident on different types of computers in
a heterogeneous network, can be recognized from their
identical Binary Object Identifiers 74.

Program control then continues with step 140 where the
Distributed Storage Manager program 24 identifies which
binary objects must be backed up during the current backup cycle. If the File Status field 82 of the Backup Queue Record 75 currently being processed is set to "NEW", then all binary objects associated with the file identified by the Backup Queue Record 75 currently being processed must be backed up during the current backup cycle. If the File Status field 82 is set to "MODIFIED", then only those binary objects associated with the file that have changed must be backed up. Those binary objects that have changed are identified by comparing the Binary Object Identifiers 74 calculated in step 138 with the corresponding Binary Object Identifiers 74 associated with the next most recent Backup Instance Record 42 for the file identified by the Backup Queue Record 75 currently being processed. The Binary Object Identifiers 74 calculated in step 138 are compared against their counterparts in the File Database 25 (e.g., the Binary Object Identifier 74 (as calculated in step 138) that identifies the first binary object in the file (as determined by the Binary Object Stream Type field 62 and the Binary Object Offset field 72) is compared to the Binary Object Identifier 74 (associated with the next most recent Backup Instance Record 42) for the first binary object in the file). This procedure allows the Distributed Storage Manager program 24 to determine which parts of a file have changed and only back up the changed data instead of backing up all of the data associated with a file when only a small portion of the file has been modified. Program control is then returned to step 116.

2. Concurrent Onsite/Offsite Backup

The Distributed Storage Manager program 24 performs two concurrent backup operations. In most cases, the Distributed Storage Manager program 24 stores a compressed copy of every binary object it would need to restore every disk drive 19 on every local computer 20 somewhere on the local area network 16 other than on the local computer 20 on which it normally resides. At the same time, the Distributed Storage Manager program 24 transmits every new or changed binary object to the remote backup file server 12. Binary objects that are available in compressed form on the local area network 16 can be restored very quickly while the much greater storage capacity on the remote backup file server 12 ensures that at least one copy of every binary object is stored and that a disaster that destroys an entire site would not destroy all copies of that site's data.

The Concurrent Onsite/Offsite Backup routine begins at step 200 of the flow chart illustrated in FIG. 5b where the Distributed Storage Manager program 24 compiles a list of those binary objects that are to be backed up during the current backup cycle. Those binary objects which must be backed up during the current backup cycle are identified in step 140 of the flow chart of FIG. 5a. Those of ordinary skill in the art will recognize, however, that the Concurrent Onsite/Offsite Backup routine may be performed independently of the routine illustrated in FIG. 5a. Program control then continues with step 202 where the Distributed Storage Manager program 24 identifies whether there are any additional binary objects to be processed. If no additional binary objects are to be processed, program control is transferred to step 204 where the Concurrent Onsite/Offsite Backup routine is terminated. Otherwise, program control continues with step 206 where the binary object currently being processed is compressed and stored in a compressed storage file 32 (FIG. 2) on one of the disk drives 19 on a local computer 20 on the local area network 16 other than the local computer 20 on which the binary object is currently stored. The compressed storage file 32 is used to allow the Distributed Storage Manager program 24 to pack several smaller compressed binary objects into a larger unit for storage. This is required to reduce the number of files that the Distributed Storage Manager program 24 must manage and to ensure that the Distributed Storage Manager program 24 does not create many "small" files since most file systems allocate some minimum amount of space to store a file even if the actual file contains less data than the allocated space. The purpose behind storing the backup copy of a binary object on a disk drive 19 on a different local computer 20 is to ensure that if the first disk drive 19 or local computer 20 fails, the backup copies of the binary objects are not lost along with the original copies of the binary objects.

Program control then continues with step 208 where each compressed storage file 32, when it reaches a maximum manageable size (e.g., two (2) megabytes), is transmitted to the remote backup file server 12 (FIG. 1) over wide area network 14 for long-term storage and retrieval. Upon arrival of the compressed storage file 32 at the remote backup file server 12, software resident on the remote backup file server 12 routes the compressed storage file 32 for ultimate storage to magnetic tape or other low cost storage media. The backup copy of a binary object which is maintained in the compressed storage file 32 on one of the disk drives 19 on one of the local computers 20 is only the most recent version of each binary object that is backed up while the backup copy of the binary object stored on the remote backup file server 12 is kept until it is no longer needed. Since most restores of files on a local area network 16 consist of requests to restore the most recent backup version of a file, the local copies of binary objects serve to handle very fast restores for most restore requests that occur on the local area network 16. If the local backup copy of a file does not exist or a prior version of a file is required, it must be restored from the remote backup file server 12. Program control then continues with step 210 where the Distributed Storage Manager program 24 determines whether sufficient space is available in the space allocated for compressed storage files 32 on the disk drives 19 on local computers 20 for storage of the binary object currently being processed. If sufficient space is available, program control is returned to step 200. Otherwise, the binary object currently being processed is deleted from the disk drive 19 on which it was stored after transmission to the remote backup file server 12 has been completed. Program control is then returned to step 200.

3. File Prioritization

The file prioritization process performed by the Distributed Storage Manager program 24 is handled by four interrelated routines of that program: (1) Backup/Restore Routine; (2) Compression Routine; (3) Local Storage Routine; and (4) Resource Allocation Routine. Each routine will be described in turn. In the following discussion, when one of the four routines is discussed, it should be understood that it is the Distributed Storage Manager program 24 that is executing the functions of that routine. The Backup/Restore Routine, the Local Storage Routine and the Compression Routine may be executed on each of the local computers 20 on the networked computer system 10 while the Resource Allocation Routine is executed on only one of the local computers 20 on the networked computer system 10. This execution scheme permits the resources of any available local computer 20 on any of the local area networks 16 to be utilized according to its availability. Furthermore, more than one local computer 20 may be utilized to complete any high priority tasks required to be completed within a specified time frame. An advantage of the process of prioritization of files is that it allows the Distributed Storage Manager program 24 to effectively deal with a situation where local
storage and wide area network transmission resources are limited. The Distributed Storage Manager program 24 is also able to keep track of data which is not stored locally or transmitted to the remote backup file server 12 in any given backup cycle and then attempt to resume these processes during the next backup cycle.

The Backup/Restore Routine is illustrated in the flow chart shown in FIG. 5c. In step 300, the Distributed Storage Manager program 24 initiates the Backup/Restore Routine by locating the highest priority binary object scheduled for backup on the local computer 20 on which the Backup/Restore Routine is executing. The identities of the binary objects to be backed up and their respective priorities were determined by the Distributed Storage Manager program 24 in the flow chart of FIG. [illegible]. Those of ordinary skill in the art will recognize, however, that the file prioritization routines of the Distributed Storage Manager program 24 may be utilized independently of the process illustrated in the flow chart of FIG. 5c. Program control then continues with step 302 where the Backup/Restore Routine of the Distributed Storage Manager program 24 determines whether a binary object to be backed up has been located. If not, program control continues with step 304 where the Backup/Restore Routine is terminated. Otherwise, program control continues with step 306 where the Backup/Restore Routine of the Distributed Storage Manager program 24 sends a message to the Resource Allocation Routine indicating the priority of the highest priority binary object that the Backup/Restore Routine has located. Program control then continues with step 308 where the Backup/Restore Routine waits for a message from the Resource Allocation Routine indicating which Compression Routine is available to compress and store the highest priority binary object located by the Backup/Restore Routine. In this way, the Distributed Storage Manager program 24 is able to perform not only local computer 20 based file prioritization but also networked computer system 10 based file prioritization. This is accomplished by having the Resource Allocation Routine examine the priority of the highest priority binary object located by each Backup/Restore Routine and then allocating compression resources to the Backup/Restore Routine which has the highest priority binary object to compress.

Program control continues with step 310 where the Backup/Restore Routine receives a message from the Resource Allocation Routine indicating that a Compression Routine is available for binary object compression. The Backup/Restore Routine then sends a list of up to forty (40) binary objects or up to one (1) megabyte of uncompressed binary objects to the Compression Routine starting with the highest priority binary object that the Backup/Restore routine has identified for backup. The reason to limit the number or size of binary objects that are sent to a Compression Routine is to allow the Compression Routine to work for a limited amount of time on the binary objects for one Backup/Restore Routine before becoming available to work on another Backup/Restore Routine's binary objects.

The Compression Routine performed by the Distributed Storage Manager program 24 is illustrated in the flow chart depicted in FIG. 5d. Program control begins at step 312 where the Compression Routine of the Distributed Storage Manager program 24 sends a message to the Resource Allocation routine indicating that the Compression Routine is available to compress binary objects. Program control then continues with step 314 where the Compression Routine waits for a compress message from a Backup/Restore Routine indicating which binary objects are to be compressed. The compress message includes the File Name 40 from Identification Record 34, the Binary Object Stream Type field 62 from the Binary Object Identification Record 58 and the Binary Object Offset field 72 from the Binary Object Identification Record 58 for each binary object to be compressed. The binary objects are placed in a compression queue (not shown) for processing by the Compression Routine. Program control then continues with step 316 where the Compression Routine sends a message to the Resource Allocation Routine to determine which Local Storage Routine has space available for storage of compressed binary objects. Program control then continues with step 318 where the Compression Routine requests allocation of a compressed storage file 32 from the Local Storage Routine that has indicated availability of storage space. Program control continues with step 320 where the binary object is compressed and stored in the allocated compressed storage file 32. Program control then continues with step 322 where the Compression Routine determines whether there are more binary objects in the compress queue. If there are no more binary objects present in the compress queue, program control returns to step 312. If more binary objects are present, program control continues with step 324 where the Compression Routine determines whether the allocated compressed storage file 32 is full. If not, program control is returned to step 320. Otherwise, program control is returned to step 316 where the Compression Routine sends a message to the Resource Allocation Routine to determine which Local Storage Routine has space available for storage of compressed binary objects.

The Local Storage Routine executed by the Distributed Storage Manager program 24 is illustrated in the flow chart depicted in FIG. 5e. The Local Storage Routine is responsible for managing storage file space on a particular local computer 20. Program control begins at step 326 where the Local Storage Routine of Distributed Storage Manager program 24 sends a message to the Resource Allocation Routine indicating the amount of storage space it has available for allocation of compressed storage files 32. The Local Storage Routine determines the amount of space it has available for allocation of compressed storage files 32 by determining the total amount of free space on its disk drives 19 and then determining how much space must be left as "free space". The amount of required "free space" is user-specified. Program control continues with step 328 where the Local Storage Routine waits for a request from a Compression Routine for allocation of a compressed storage file 32. Upon receipt of such a request, program control continues with step 330 where the requested compressed storage file 32 (e.g., two (2) megabytes in size) is allocated. The Local Storage Routine then returns a message to the requesting Compression Routine indicating the name and location of the compressed storage file 32 that has been allocated. Program control is then returned to step 326.

The Resource Allocation Routine performed by the Distributed Storage Manager program 24 of the present invention is depicted in the flow chart of FIG. 5f. The Resource Allocation Routine is a process that responds to messages from other routines of the Distributed Storage Manager program 24 and allocates resources between resource requesters and resource providers. Program control begins with step 332 where the Resource Allocation Routine executed by the Distributed Storage Manager program 24 waits for a message from a Distributed Storage Manager program 24 routine. When a message is received, program control continues with step 334 where the Resource Allocation Routine determines whether the message is from a Backup/Restore Routine transmitting information relating to
its highest priority binary object for compression. If such a message is received, program control continues with step 336 where the Resource Allocation Routine stores this information in an internal table containing Backup/Restore Routine status information. The Resource Allocation Routine then scans this status information table to ascertain which Backup/Restore Routine has the highest priority binary object for storage. Program control then continues with step 338 where the Resource Allocation Routine determines whether any Compression Routine is available to process the highest priority binary object. If no Compression Routine is available for processing, program control is returned to step 332. If an available Compression Routine is located, program control continues with step 340 where the Resource Allocation Routine transmits a message to the requesting Backup/Restore Routine indicating which Compression Routine is available to compress the binary object. In addition, the Resource Allocation Routine marks the Compression Routine as "working" in an internal table containing Compression Routine information. Program control is then returned to step 332.

If the Resource Allocation Routine determines, in step 334, that the received message is not from a Backup/Restore Routine, program control continues with step 342 where the Resource Allocation Routine determines whether the received message is from a Compression Routine indicating that the transmitting Compression Routine is available for processing. If the received message is from a Compression Routine, program control continues with step 344 where the Resource Allocation Routine marks the transmitting Compression Routine as "available" in its internal table containing Compression Routine information. Program control then continues with step 336.

If the Resource Allocation Routine determines, in step 342, that the received message has not been transmitted from a Compression Routine indicating its availability for processing, program control continues with step 346 where the Resource Allocation Routine determines whether the received message is from a Local Storage Routine indicating the amount of storage space that the Local Storage Routine has available. If the received message is from a Local Storage Routine, program control continues with step 348 where the Resource Allocation Routine locates the transmitting Local Storage Routine in an internal table containing Local Storage Routine information and saves the storage space information transmitted by the Local Storage Routine. Program control then continues with step 354. If the Resource Allocation Routine determines, in step 346, that the received message is not from a Local Storage Routine, program control continues with step 350 where the Resource Allocation Routine determines whether the received message is from a Compression Routine requesting a compressed storage file 32. If the Resource Allocation Routine determines that such a message was received, program control continues with step 352 where the identity of the requesting Compression Routine is added to a "space request list". Program control then continues with step 354. If the Resource Allocation Routine determines, in step 350, that the received message is not from a Compression Routine requesting a compressed storage file 32, program control is returned to step 332.

In step 354, the Resource Allocation Routine determines whether any Compression Routines are waiting for allocation of a compressed storage file 32 by examining the "space request list". If no such Compression Routines are in the "space request list", program control is returned to step 332. If the identity of such a Compression Routine is found in the "space request list", program control continues with step 356 where the Resource Allocation Routine determines whether any of the Local Storage Processors has any available space by examining the information in its internal table containing Local Storage Routine information. When a Compression Routine requests a compressed storage file 32, the Compression Routine identifies the name of the local computer 20 on which the Backup/Restore Routine is executing and on whose behalf it is compressing binary objects. This allows the Compression Routine to request a compressed storage file 32 on a local computer 20 other than the local computer 20 on which the binary object to be stored resides. Otherwise, the backup copy of the binary object may be stored on the same local computer 20 as the original binary object whereby a disk drive failure would result in losing both the original and backup copies. The Resource Allocation Routine uses the information supplied by the Compression Routine to ensure that the requested compressed storage file 32 is allocated on a local computer 20 other than the local computer 20 from which the binary object originated. If available storage space is located, program control continues with step 358 where the Resource Allocation Routine transmits a message to the next Compression Routine in the "space request list" indicating which Local Storage Routine has allocated an available compressed storage file 32. Program control is then returned to step 354.

If the Resource Allocation Routine determines, in step 356, that no storage space is available, program control continues with step 360 where the Resource Allocation Routine determines whether there are any compressed storage files 32 that are maintained by any of the Local Storage Routines which have a lower priority than the binary object currently being processed. If so, program control continues with step 362 where a message is transmitted to the Local Storage Routines instructing the Local Storage Routines to delete some of the low-priority compressed storage files 32 to make room for higher priority binary objects. After these lower-priority compressed storage files 32 are deleted by the Local Storage Routines, the Local Storage Routines will transmit new status messages to the Resource Allocation Routine. Program control is then returned to step 332. If no lower-priority compressed storage files 32 are located in step 360, program control continues with step 364 where the Resource Allocation Routine transmits a message to the Local Storage Routines with instructions that from that time forward, any allocated compressed storage files 32 are to be deleted after the contents of the compressed storage files 32 have been successfully transmitted to the remote backup file server 12 for long-term storage. Program control is then returned to step 332.

4. Granularization of Files

The most important class of "large" files on computer systems such as networked computer system 10 is databases. Typically, on a given day, only a small percentage of the data in a large database is changed by the users of that database. However, it is likely that some data will be changed in each one of the one (1) megabyte binary object segments that are created in step 136 of the flow chart depicted in FIG. 5a. As a result, in most cases, the entire "large" database file would have to be backed up to the remote backup file server 12. However, the Distributed Storage Manager program 24 of the present invention utilizes a technique of subdividing large database files into "granules" and then tracks changes from the previous backup copy at the "granule" level. The "granule" size utilized by the Distributed Storage Manager program 24 may be one (1) kilobyte although those of
ordinary skill in the art will recognize that any "granule" size that produces the most efficient results (in terms of processing time and amount of data that must be backed up) may be utilized. This technique of subdividing files into "granules" is only used to reduce the amount of data that must be transmitted to the remote backup file server 12 and is not utilized in making backup copies of binary objects for storage on local computers 20.

The operation of the Distributed Storage Manager program 24 in subdividing files into "granules" is illustrated in the flow chart depicted in FIG. 5g. This "granularization" procedure is performed for "large" files following step 136 of the flow chart of FIG. 5a. Program control begins at step 400 where the Distributed Storage Manager program 24 identifies whether the binary object currently being processed is a segment of a "large" database-like file. Program control then continues with step 402 where the Distributed Storage Manager program 24 determines whether this is the first time that the binary object currently being processed is being backed up using the "granularization" technique. If so, program control continues with step 404 where the Distributed Storage Manager program 24 creates a "shadow file" which contains a "contents identifier" for each "granule" in the binary object currently being processed. Each "contents identifier" is comprised of a standard 32-bit Cyclical Redundancy Check number which is calculated against the contents of the "granule" and a 32-bit hash number which is calculated against the contents of the "granule" in the same manner described in relation to step 138 of the flow chart depicted in FIG. 5a. Those of ordinary skill in the art will readily recognize the manner in which the Cyclical Redundancy Check number is calculated. Each time that the binary object is to be backed up, the Distributed Storage Manager program 24 can calculate the "contents identifier" for each "granule" in the binary object and then compare it to the "contents identifier" of the "granule" the last time the binary object was backed up and determine if the "granule" has changed. This allows the Distributed Storage Manager program 24 to determine what data within a binary object has changed and only back up the changed data instead of the entire binary object. Program control then continues with step 406 where the Distributed Storage Manager program 24 calculates a "change identifier" for each "granule" of the binary object and stores it in the "shadow file" for that binary object. Program control then continues with step 408 where the binary object is compressed into a compressed storage file 32 which becomes the most recent complete copy of the binary object for later reconstitution of the binary object as is discussed more fully hereinbelow. The contents of the compressed storage file 32 is then transmitted to the remote backup file server 12 for long-term storage and retrieval. Program control is then returned to step 400.

If the Distributed Storage Manager program 24 determines, in step 402, that this is not the first time that the binary object currently being processed is being backed up using the "granularization" technique, program control continues with step 410 where the Distributed Storage Manager program 24 calculates the "contents identifier" for each "granule". Program control continues with step 412 where each newly-calculated "contents identifier" is compared to the corresponding "contents identifier" for the "granule" in the "shadow file". If the two values are equal, program control continues with step 414 where the Distributed Storage Manager program 24 determines whether the last "granule" of the binary object has been processed. If so, program control is returned to step 400; otherwise, program control continues at step 410. If the "contents identifiers" are not found to be equal in step 412, the "granule" has changed and program control continues with step 416 where the "shadow file" is updated with the newly-calculated "contents identifier" for the "granule". Program control then continues with step 418 where the changed "granule" is compressed into a compressed storage file 32 using a special format that identifies the "granule". All changed "granules" for the "data stream" currently being processed are packed together in the same compressed storage file 32. The contents of the compressed storage file 32 is then transmitted to the remote backup file server 12 for long-term storage and retrieval. If the Distributed Storage Manager program 24 determines that a large percentage of the "granules" in the binary object have changed (e.g., 80%), then the entire binary object is backed up to the remote backup file server 12.

The operation of the Distributed Storage Manager program 24 in reconstituting, on a local computer 20, a binary object that has been transmitted to the remote backup file server 12 using the "granularization" technique illustrated in FIG. 5g is illustrated in the flow chart depicted in FIG. 5h. Program control begins at step 420 where the Distributed Storage Manager program 24 creates a work area on the remote backup file server 12 that is equal in size to the total uncompressed size of the binary object that is to be reconstituted. Program control continues with step 422 where the most recent complete copy of the binary object to be reconstituted is located on the remote backup file server 12 and is decompressed into the work area. Program control then continues with step 424 where the Distributed Storage Manager program 24 creates a bitmap with one bit representing each granule of the binary object to be reconstituted. Initially, all bits in this bitmap are set to zero (0). Each bit in the bitmap is used to indicate whether the granule associated with that bit has been restored to the most recent complete copy of the binary object. Program control then continues with step 426 where the Distributed Storage Manager program 24 locates the most recent "granularized" copy of the binary object that was stored on the remote backup file server 12. Each time that step 426 is executed, the next most recent "granularized" copy of the binary object is located. This process continues until all bits in the bitmap are set to one (1) or until there are no more "granularized" copies of the binary object that are newer than the most recent complete copy of the binary object. At that point, the binary object will have been reconstituted and will be ready to be restored to the local computer 20. Following step 426, program control continues with step 428 where the Distributed Storage Manager program 24 determines whether another "granularized" copy of the binary object has been located. If so, program control continues with step 430 where the Distributed Storage Manager program 24 obtains the list of "granules" in the "granularized" copy of the binary object just located. If another "granularized" copy of the binary object is not located in step 428, program control continues with step 438 where the reconstituted binary object is restored to the local computer 20. Following step 430, program control continues with step 432 where, starting with the first "granule" in the "granularized" copy of the binary object, the Distributed Storage Manager program 24 determines whether the bit for this "granule" in the bitmap is set to zero (0). If the bit is set to one (1), a more recent copy of the "granule" has already been decompressed and copied into the work area. If the bit is set to zero (0), program control continues with step 434 where the "granule" is decompressed and copied into the work area at the correct location for that "granule". After copying the "granule" to the work area, the Distributed




Storage Manager program 24 sets the bit within the bitmap for the "granule" to one (1). If the Distributed Storage Manager program 24 determines, in step 432, that the bit is not set to zero (0), program control continues with step 440 where the Distributed Storage Manager program 24 determines whether there are any more "granules" to be processed in the current set of "granules". If so, program control is returned to step 432; otherwise, program control is transferred to step 426. Following step 434, program control continues with step 436 where the Distributed Storage Manager program 24 determines whether all bits in the bitmap are now set to one (1). If so, program control continues with step 438 where the reconstituted binary object is restored to the local computer 20. If the Distributed Storage Manager program 24 determines, in step 436, that all bits in the bitmap are not set to one (1), program control continues with step 440.
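
The reconstitution procedure of steps 420 through 440 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the program's actual implementation: the function name reconstitute, the use of a dictionary mapping granule index to granule bytes for each "granularized" copy, and the assumption that data is already decompressed and that the binary object keeps the same size across versions are all hypothetical choices made for the example. The one (1) kilobyte granule size is the example value given in the text.

```python
from typing import Dict, List

GRANULE_SIZE = 1024  # one kilobyte, the example "granule" size from the text


def reconstitute(full_copy: bytes,
                 granularized_copies: List[Dict[int, bytes]]) -> bytes:
    """Rebuild the newest version of a binary object (hypothetical helper).

    full_copy: the most recent complete copy, assumed already decompressed.
    granularized_copies: "granularized" copies newer than full_copy, ordered
        newest first; each maps a granule index to that granule's bytes.
    """
    # Steps 420-422: work area the size of the object, seeded with the
    # most recent complete copy.
    work_area = bytearray(full_copy)

    # Step 424: one flag per granule, all initially zero (False).
    granule_count = -(-len(work_area) // GRANULE_SIZE)  # ceiling division
    restored = [False] * granule_count

    # Step 426: visit the granularized copies from newest to oldest.
    for copy in granularized_copies:
        # Steps 430-434: apply only granules not already restored by a
        # newer copy, then set the granule's flag to one (True).
        for index, data in copy.items():
            if restored[index]:
                continue
            start = index * GRANULE_SIZE
            work_area[start:start + len(data)] = data
            restored[index] = True
        # Step 436: stop early once every granule has been restored.
        if all(restored):
            break

    # Step 438: the work area now holds the reconstituted binary object.
    return bytes(work_area)
```

Walking the copies newest first is what makes the bitmap sufficient: the first granule written to a position is by construction the most recent one, so any older granule for the same position can be skipped without comparing contents.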

The technique of "granularizing" "large" files also becomes useful
when a current version of a file (comprised of current versions of
binary objects) must be restored to a previous version of that file
(comprised of previous versions of binary objects). Each binary object
comprising the current version of the file can be restored to the
binary object comprising the previous version of the file by restoring
and updating only those "granules" of the current version of the
binary objects that are different between the current and previous
versions of the binary objects. This technique is illustrated in the
flow chart depicted in FIG. BL. Program control begins at step 442
where the Distributed Storage Manager program 24 obtains from the user
the identities of the current and previous versions of the file
(comprised of binary objects) which needs to be restored. Program
control continues with step 443 where the Distributed Storage Manager
program 24 compiles a list of all binary objects comprising the
current version of the user-specified file. This information is
obtained from the File Database 25. Program control then continues
with step 444 where the Distributed Storage Manager program 24
calculates "contents identifiers" for each "granule" within the
current version of each binary object as it exists on the local
computer 20. Program control then continues with step 446 where the
Distributed Storage Manager program 24 transmits an "update request"
to the remote backup file server 12 which includes the Binary Object
Identification Record 58 for the previous version of each binary
object as well as the list of "contents identifiers" calculated in
step 444. Program control continues with step 448 where the
Distributed Storage Manager program 24 reconstitutes each previous
version of the binary objects according to the technique illustrated
in the flow chart depicted in FIG. 5k. Program control then continues
with step 450 where the Distributed Storage Manager program 24, for
each binary object, compares the "contents identifier" of the next
"granule" in the work area of remote backup file server 12 against the
corresponding "contents identifier" calculated in step 444. Program
control continues with step 452 where the Distributed Storage Manager
program 24 determines whether the "contents identifiers" match. If so,
program control is returned to step 450 since this "granule" is the
same on the local computer 20 and on the remote backup file server 12.
If the Distributed Storage Manager program 24 determines, in step 452,
that the "contents identifiers" do not match, program control
continues with step 454 where the Distributed Storage Manager program
24 transmits the "granule" to the local computer 20. Program control
then continues with step 456 where the "granule" received by the local
computer 20 is written directly to the current version of the binary
object at the appropriate location. Program control then continues
with step 458 where the Distributed Storage Manager program 24
determines whether there are any more "granules" to be examined for
the binary object currently being processed. If so, program control is
returned to step 450; otherwise the file restore routine is terminated
at step 460. After all "granules" are received from the remote backup
file server 12, the binary object has been restored to the state of
the previous version.
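
The exchange in steps 444 through 456 can be sketched as follows. This is only an illustration under stated assumptions: a CRC-32 stands in for the "contents identifier", the granule size is a hypothetical constant, and the three function names are invented for the example. The first function plays the role of the local computer in step 444, the second the remote backup file server in steps 450 through 454, and the third the local write in step 456.

```python
import zlib

GRANULE_SIZE = 4  # hypothetical granule size, kept tiny for illustration


def contents_identifiers(data, size=GRANULE_SIZE):
    """Local side: one identifier per granule of the current version."""
    return [zlib.crc32(data[i:i + size]) for i in range(0, len(data), size)]


def granules_to_send(previous, local_ids, size=GRANULE_SIZE):
    """Server side: pick the granules of the previous version whose
    identifier differs from (or is missing in) the local list."""
    updates = []
    for n, offset in enumerate(range(0, len(previous), size)):
        granule = previous[offset:offset + size]
        if n >= len(local_ids) or zlib.crc32(granule) != local_ids[n]:
            updates.append((offset, granule))
    return updates


def apply_granules(current, updates, final_length):
    """Local side: write each received granule at its offset."""
    buf = bytearray(current)
    for offset, granule in updates:
        buf[offset:offset + len(granule)] = granule
    del buf[final_length:]  # drop any tail beyond the previous version
    return bytes(buf)
```

Only the granules that differ cross the network, which is the point of the technique: an unchanged granule costs one identifier comparison rather than a transfer.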

5. Auditing and Reporting 

The Distributed Storage Manager program 24 is able to perform
self-audits on a periodic basis to ensure that the binary objects that
have been backed up can be restored. To perform an audit, the
Distributed Storage Manager program 24 executes the steps illustrated
in the flow chart of FIG. 5/. Program control begins at step 500 where
the Distributed Storage Manager program 24 initiates a restore of a
randomly selected binary object identified by a Binary Object
Identification Record 58 stored in File Database 25. Program control
continues with step 502 where the selected binary object is restored
from either a compressed storage file 32 residing on one of the disk
drives 19 of one of the local computers 20 or from the remote backup
file server 12. Program control then continues with step 504 where, as
the binary object is being restored, a Binary Object Identifier 74 is
calculated from the binary object instead of writing the binary object
to one of the disk drives 19 of one of the local computers 20. Program
control then continues with step 506 where the Distributed Storage
Manager program 24 compares the Binary Object Identifier 74 calculated
in step 504 to the original Binary Object Identifier 74 stored as part
of the randomly selected Binary Object Identification Record 58. If
the values are equal, program control continues with step 508 where
the Distributed Storage Manager program 24 logs a successful audit
restore. If the values are not equal, program control continues with
step 510 where the Distributed Storage Manager program 24 generates an
event indicating an audit failure.
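
The audit comparison of steps 504 through 510 can be sketched as follows. The four-field layout (size, Cyclical Redundancy Check, Longitudinal Redundancy Check, hash) follows the claims, but the concrete algorithms are assumptions made for the example: CRC-32 from zlib, a byte-wise XOR as the Longitudinal Redundancy Check, and Adler-32 as a stand-in for the hash field. The function names are hypothetical.

```python
import zlib


def binary_object_identifier(chunks):
    """Compute a four-field identifier over a stream of byte chunks,
    without ever holding or writing the whole binary object."""
    size = 0
    crc = 0
    lrc = 0
    digest = 1  # Adler-32 starting value
    for chunk in chunks:
        size += len(chunk)
        crc = zlib.crc32(chunk, crc)          # running CRC-32
        digest = zlib.adler32(chunk, digest)  # running stand-in hash
        for byte in chunk:
            lrc ^= byte                       # byte-wise XOR check
    return (size, crc, lrc, digest)


def audit(restore_stream, recorded_identifier):
    """Return True for a successful audit restore, False for a failure."""
    return binary_object_identifier(restore_stream) == recorded_identifier
```

Because the identifier is accumulated chunk by chunk, the audit needs no scratch space on the disk drive, which matches the description of calculating the identifier instead of writing the restored object.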

6. Virtual Restore

The disk drives 19 associated with local computers 20 may have a very
large storage capacity and may require a significant amount of time to
be restored, especially if most or all of the data must be transmitted
from the remote backup file server 12. To reduce the amount of time
that a local computer 20 is "offline" during a full disk drive 19
restore, the Distributed Storage Manager program 24 employs a
technique which allows a disk drive 19 associated with a local
computer 20 to be only partially restored before being put back
"online" for access by local computer 20. The user specifies to the
Distributed Storage Manager program 24 that only those files that have
been accessed in the last <n> days, <n> weeks or <n> months should be
restored to the disk drive 19 before the disk drive 19 is returned to
the "online" state. Alternately, the user may specify that only files
that are stored "locally" in compressed storage files 32 should be
restored and that no files stored on the remote backup file server 12
should be restored before the disk drive 19 is returned to the
"online" state. The overall result is a minimization of restore time
in the event of disk drive 19 failure. This "virtual restore"
technique generally works quite well since users who will begin
accessing data on a particular disk drive 19 after it is put back
"online" will most likely only be accessing data that had been
"recently" accessed before failure of the disk drive 19.
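
The two selection policies just described can be sketched as follows. The record type, field names and function names are hypothetical, and treating a month as 30 days is an assumption made only to keep the example short.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class BackedUpFile:
    name: str
    last_access: date
    stored_locally: bool  # True if held in a local compressed storage file


def cutoff_date(today, n, unit):
    """Translate "within the last <n> days/weeks/months" into a date.
    A month is approximated as 30 days (an assumption)."""
    days_per_unit = {"days": 1, "weeks": 7, "months": 30}
    return today - timedelta(days=n * days_per_unit[unit])


def files_to_restore_first(files, cutoff=None, local_only=False):
    """Select the files that must be back before the drive goes online."""
    if local_only:
        # Policy 2: restore only what is available locally.
        return [f.name for f in files if f.stored_locally]
    # Policy 1: restore files last accessed on or after the cutoff date.
    return [f.name for f in files if f.last_access >= cutoff]
```

Files that are not selected stay on the backup media until requested, so the time the drive spends "offline" depends on the size of the recently used working set rather than on the capacity of the drive.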

The "virtual restore" process is illustrated in the flow chart
depicted in FIG. SJL. Program control begins at step 600 where the
Distributed Storage Manager program 24 obtains, from the user, the
last access date that defines which files must be restored before the
disk drive 19 can be returned to the "online" condition. Any files
that were last accessed on or after this date will be restored before
the disk drive 19 is placed "online". The specification of this date
may be accomplished in any of the following ways: (1) actual date;
(2) "within the last <n> days"; (3) "within the last <n> weeks"; or
(4) "within the last <n> months". Alternately, the user may specify
that only files that are currently backed up in compressed storage
files 32 are to be restored as opposed to files stored on remote
backup file server 12. Program control continues with step 602 where
the Distributed Storage Manager Program 24 locates the most recent
version of the File Database 25 for the disk drive 19 to be restored
if the File Database 25 does not already exist on the disk drive 19.
Program control then continues with step 604 where the next File
Identification Record 34 in File Database 25 is read. Program control
continues with step 606 where the Distributed Storage Manager program
24 determines whether an additional File Identification Record 34 to
be processed has been located in File Database 25. If not, program
control continues with step 608 where the Distributed Storage Manager
program 24 notifies the local computer 20 that the restored disk drive
19 may be placed "online" and terminates the virtual restore process.
If another File Identification Record 34 has been located for
processing, program control continues with step 610 where the
Distributed Storage Manager program 24 locates the most recent Backup
Instance Record 42 associated with the File Identification Record 34
currently being processed. In step 612, the Distributed Storage
Manager program 24 determines whether the Last Access Date/Time field
52 in the Backup Instance Record 42 indicates that the file has been
accessed since the user-specified last access date (step 600). If the
file has been accessed on or since the user-specified last access
date, program control continues with step 614 where the Distributed
Storage Manager program 24 initiates the restoration of this file and
sets the Migration Status field 41 in the File Identification Record
34 currently being processed to "NORMAL". Program control is then
returned to step 604. If the Distributed Storage Manager program 24
determines, in step 612, that the file has not been accessed on or
since the user-specified last access date, program control continues
with step 616 where the Distributed Storage Manager program 24 sets
the Migration Status field 41 in the File Identification Record 34
currently being processed to "MIGRATED". In this case, the file does
not need to be restored. Program control is then returned to step 604.

Another feature of the virtual restore process is the ability to
utilize the Migration Status field 41 in File Identification Record 34
for the performance of space management. If a particular file has not
been accessed on or since a user-specified last access date, the file
can be backed up to the remote backup file server 12 and then deleted
from the disk drives 19 associated with local computers 20. The
Migration Status field 41 is then set to "MIGRATED". If a migrated
file is later needed by a user, the file can be restored from the
remote backup file server 12.

7. Backup File Retention

The Distributed Storage Manager program 24 implements a backup file
retention scheme wherein a retention pattern is maintained for each
individual file that indicates which backup versions of a file are to
be saved. A retention pattern for a file is defined as:

keep the last "d" daily backup copies of the file AND
keep the last "w" weekly backup copies of the file AND
keep the last "m" monthly backup copies of the file AND
keep the last "q" quarterly backup copies of the file AND
keep the last "y" yearly backup copies of the file.

By specifying the retention pattern in this way, all backup copies of
a file that are needed to represent the backup of the file as it
existed at the time it was backed up for the last "d" days, the last
"w" weeks, the last "m" months, the last "q" quarters and the last "y"
years are saved. However, by way of example, this may mean that only
one backup copy of the file is saved to represent all "d" days (in the
case where the file has not changed in the last "d" days), or this may
mean that the last "d" daily backup copies of the file must be saved
to represent the file as it existed for the last "d" daily backup
cycles. The same principle is utilized for weekly, monthly, quarterly
and yearly copies.

The backup file retention scheme utilized by the Distributed Storage
Manager program 24 provides several unique benefits. First, this
technique prevents undetected virus or application program damage to a
file from destroying all good backup copies of a file. If a file is
damaged and this condition is not noticed for several days, then a
scheme which only maintains the last "n" versions of a file may result
in the situation where an "undamaged" backup copy of the file is not
available. The backup file retention scheme of the present invention
allows backup copies of files to be kept that represent the file as it
existed at various times during the past several days, weeks, months
or even years. Second, the file retention scheme utilized by the
Distributed Storage Manager program 24 eliminates the need for most
archives. Most archives are designed to take a snapshot of a group of
files as of a certain date, such as at the end of each month. The
Distributed Storage Manager program's use of retention patterns
eliminates the need for users to take periodic snapshots of their data
using a special archive, since the Distributed Storage Manager program
24 handles this automatically.

In order for the Distributed Storage Manager program 24 to implement
the backup file retention scheme, each file stored on the local
computers 20 must be associated with a specific retention pattern. The
Management Class field 43 in the File Identification Record 34 of File
Database 25 specifies a management class for each file. In turn, each
management class is associated with a specific file retention pattern.
In this way, a specific retention pattern is associated with each
file. Those of ordinary skill in the art will recognize that other
methods of assigning a specific file retention pattern to a file may
also be utilized.

The operation of the backup file retention scheme utilized by the
Distributed Storage Manager program 24 is illustrated in the flow
chart of FIG. 5l. Program control begins at step 700 where the
Distributed Storage Manager program 24 locates each File
Identification Record 34 in the File Database 25. Program control
continues at step 702 where the Distributed Storage Manager program 24
determines the required file retention pattern by examining the
Management Class field 43 in the File Identification Record 34
currently being processed and then creates a "retention working list".
The "retention working list" is a list of entries that specify the
starting and ending dates for each backup copy that should be retained
based upon the specified retention pattern. For example, if the user
has specified that the last "d" daily backup copies must be retained,
then the "retention working list" will contain "d" entries with the
"start date" equal to the "end date" for each entry and the dates for
the first entries set equal to the current date, the dates for the
second entries set equal to the previous day's date, etc. For
weekly entries, the "retention working list" will contain entries (one
per weekly backup copy to be retained) with the "start date" set to
the date that specifies the beginning of the prior week (based on the
current date) and the "end date" set to the date that specifies the
end of the prior week (based on the current date). If "w" weeks are to
be retained, then "w" weekly "retention working list" entries will be
created. At the end of this process, the "retention working list" will
contain a list of "windows" which indicates the date ranges that a
file must fall within in order to be retained.

Program control continues with step 704 where the Distributed Storage
Manager program 24 locates the most recent Backup Instance Record 42
associated with the File Identification Record 34 currently being
processed. Program control continues with step 706 where the
Distributed Storage Manager program 24 compares the date stored in the
Insert Date field 57 of the Backup Instance Record 42 currently being
processed with any "unused" date ranges set forth in the "retention
working list" (if any of the "retention working list" entries have
already been satisfied, they will be [illegible in source]). If the
date stored in Insert Date field 57 does not fall within any of the
unused "retention working list" date ranges, then program control
continues with step 712 where the Backup Instance Record 42 is
deleted. Otherwise, program control continues with step 708 where all
"retention working list" entries satisfied by the date stored in the
Insert Date field 57 are marked as "used" to indicate that a Backup
Instance Record 42 has been used to satisfy this entry. This ensures
that an older Backup Instance Record 42 is not used to satisfy a
retention pattern specification when a newer entry also satisfies the
condition. The Distributed Storage Manager program 24 also checks to
ensure that the file associated with the Backup Instance Record 42 has
not been deleted prior to the "end date" of the window satisfied by
the date stored in Insert Date field 57. This condition is satisfied
by ensuring that the date stored in the Delete Date field 56 of the
Backup Instance Record 42 currently being processed is after the "end
date" of the window satisfied by the date stored in Insert Date field
57. If the file was deleted prior to the "end date" of the window,
then the file cannot be used to satisfy the "retention working list"
entry since that file did not exist on the "end date". Following
either step 708 or step 712, program control continues with step 710
where the Distributed Storage Manager program 24 determines whether
there are any additional Backup Instance Records 42 associated with
the File Identification Record 34 currently being processed. If so,
program control is returned to step 704; otherwise, program control is
returned to step 700.

While the present invention has been described in connection with an
exemplary embodiment thereof, it will be understood that many
modifications and variations will be readily apparent to those of
ordinary skill in the art. This disclosure and the following claims
are intended to cover all such modifications and variations.

We claim:

1. A system for distributed management of the storage space and data
on a networked computer system wherein the networked computer system
includes at least two storage devices for storing data files, said
distributed storage management system comprising:

means for selectively copying data files stored on one of the storage
devices to another of the storage devices;

means for dividing each data file into one or more binary objects of a
predetermined size;

means for calculating a current value for a binary object identifier
for each binary object within a file, said calculation of said binary
object identifier being based upon the actual data contents of the
associated binary object, said calculated binary object identifier
being saved as the name of the associated binary object;

means for comparing said current name of a particular binary object to
one or more previous names of said binary object;

means for storing said current name of said binary object; and

means for controlling said means for selectively copying objects in
response to said means for comparing.

2. The distributed storage management system of claim 1 wherein said
means for calculating said current name includes means for calculating
a current name comprised of at least two independently determined
values.

3. The distributed storage management system of claim 1 wherein said
means for calculating said current name includes means for calculating
a 128-bit binary value comprised of four 32-bit fields.

4. The distributed storage management system of claim 3 wherein said
four 32-bit fields include a binary object identifier size field, a
Cyclical Redundancy Check number field calculated against the contents
of the binary object, a Longitudinal Redundancy Check number field
calculated against the contents of the binary object, and a binary
object hash number field calculated against the contents of the binary
object.

5. The distributed storage management system of claim 1 further
including means for auditing the performance of said distributed
storage management system, said means for auditing including:

second means for controlling said means for selectively copying binary
objects to recopy a previously copied binary object;

means for recalculating said binary object identifier for the recopied
binary object, said recalculated binary object identifier being saved
as the name of the associated binary object;

means for comparing said recalculated name to a previous name of said
binary object; and

means for reporting a failure if said recalculated name is not
identical to said previous name of said binary object.

6. The distributed storage management system of claim 1 wherein said
means for controlling said means for selectively copying includes
means for instructing, in response to said means for comparing, said
means for selectively copying to copy a particular binary object only
if its current name is not identical to a previous name for that
particular binary object.

7. The distributed storage management system of claim 1 additionally
comprising means for segmenting the binary objects into granules of
data, and wherein said granules of data are processed in the same
manner as said binary objects.

8. The distributed storage management system of claim 7 further
including means for reconstructing a binary object from a most recent
complete copy of the binary object, said means for reconstructing
including:

means for copying said granules copied by said means for selectively
copying granules to said most recent complete copy of the binary
object in order from most-recently copied granule to least-recently
copied granule; and

means for generating a bitmap for controlling said means for copying
said copied granules.

9. The distributed storage management system of claim 8 wherein said
means for calculating said current name for
said granule includes means for calculating a 32-bit Cyclical
Redundancy Check number calculated against the contents of said
granule and means for calculating a 32-bit binary object hash number
calculated against the contents of said granule.

10. The distributed storage management system of claim 8 further
including means for restoring a current version of a binary object to
a previous version of that binary object, said means for restoring
including:

means for calculating said name for each of said granules in the
current version of the binary object;

means for comparing said calculated name to a previous name for each
of said granules in the current version of the binary object; and

means, responsive to said means for comparing said names, for
replacing those granules in the current version of the binary object
for which said names are not identical.

11. The distributed storage management system of claim 1 additionally
comprising:

means for indicating which of said copied binary objects must be
copied to a particular storage device in the event of a failure of
that storage device before that storage device is considered to be
operable by the local computer with which that storage device is in
communication; and

wherein said means for controlling said means for selectively copying
binary objects is further responsive to said means for indicating.

12. The distributed storage management system of claim 11 wherein said
means for indicating includes means for specifying a last access date
such that only binary objects that have been accessed by the networked
computer system on or after said last access date must be copied to a
particular storage device before that storage device is considered to
be operable.

13. The distributed storage management system of claim 1 additionally
comprising:

means for maintaining a file retention list wherein said file
retention list includes a file retention pattern for each binary
object copied by said means for selectively copying binary objects;

means for determining which of the binary objects copied by said means
for selectively copying binary objects match each of said file
retention patterns; and

means for deleting the binary objects from the storage devices in
response to said means for determining.

14. The distributed storage management system of claim 13 wherein said
file retention pattern includes daily, weekly, monthly, quarterly and
yearly retention patterns.

15. The distributed storage management system of claim 1 wherein said
networked computer system includes a remote backup file server, and
wherein said means for selectively copying copies the binary objects
stored on one of the storage devices to another of the storage devices
or to the remote backup file server, said distributed storage
management system additionally comprising:

means for employing user-defined priorities to determine which binary
objects are to be copied to another storage device and to determine a
queuing sequence for copying binary objects to the remote backup file
server.

16. A method for management of the storage space and data on a
computer system wherein the computer system includes at least two
storage areas for storing data files, said method comprising the steps
of:

dividing each data file into one or more binary objects of a
predetermined size;

calculating a current value for a binary object identifier for each
binary object within a file, said calculation of said binary object
identifier being based upon the actual data contents of the associated
binary object, said calculated binary object identifier being saved as
the name of the associated binary object;

comparing said current name of said binary object to one or more
previous names of said binary object;

storing said current name of said binary object as a previous name of
said binary object; and

selectively copying binary objects in response to said comparing step.

17. The method of claim 16 wherein said step of calculating said
current name for said binary object includes the step of calculating a
current name for a binary object comprised of at least two
independently determined values.

18. The method of claim 16 wherein said step of calculating said
current name for said binary object includes the step of calculating a
current name for a binary object utilizing a 128-bit binary value
comprised of four 32-bit fields and wherein said four 32-bit fields
include a binary object identifier size field, a Cyclical Redundancy
Check number field calculated against the contents of the binary
object, a Longitudinal Redundancy Check number field calculated
against the contents of the binary object, and a binary object hash
number field calculated against the contents of the binary object.